Humans demonstrate a variety of interesting behavioral characteristics when performing tasks, such as selecting between seemingly equivalent optimal actions, performing recovery actions when deviating from the optimal trajectory, or moderating actions in response to sensed risks. However, imitation learning, which attempts to teach robots to perform these same tasks from observations of human demonstrations, often fails to capture such behavior. Specifically, commonly used learning algorithms embody inherent contradictions between the learning assumptions (e.g., single optimal action) and actual human behavior (e.g., multiple optimal actions), thereby limiting robot generalizability, applicability, and demonstration feasibility. To address this, this paper proposes designing imitation learning algorithms with a focus on utilizing human behavioral characteristics, thereby embodying principles for capturing and exploiting actual demonstrator behavioral characteristics. This paper presents the first imitation learning framework, Bayesian Disturbance Injection (BDI), that typifies human behavioral characteristics by incorporating model flexibility, robustification, and risk sensitivity. Bayesian inference is used to learn flexible non-parametric multi-action policies, while simultaneously robustifying policies by injecting risk-sensitive disturbances to induce human recovery action and ensuring demonstration feasibility. Our method is evaluated through risk-sensitive simulations and real-robot experiments (e.g., table-sweep task, shaft-reach task and shaft-insertion task) using the UR5e 6-DOF robotic arm, to demonstrate the improved characterisation of behavior. Results show significant improvement in task performance, through improved flexibility, robustness as well as demonstration feasibility.
Scenarios requiring humans to choose from multiple seemingly optimal actions are commonplace; however, standard imitation learning often fails to capture this behavior. Instead, an over-reliance on replicating expert actions induces inflexible and unstable policies, leading to poor generalizability in application. To address this problem, this paper presents the first imitation learning framework that incorporates Bayesian variational inference for learning flexible non-parametric multi-action policies, while simultaneously robustifying the policies against sources of error by introducing and optimizing disturbances to create a richer demonstration dataset. This combined approach forces the policy to adapt to challenging situations, enabling stable multi-action policies to be learned efficiently. The effectiveness of the proposed method is evaluated through simulations and real-robot experiments on a table-sweep task using the UR3 6-DOF robotic arm. Results show that, through improved flexibility and robustness, the learning performance and control safety are better than those of the comparison methods.
This study proposes novel control methods that lower impact force by preemptive movement and smoothly transition to conventional contact impedance control. These suggested techniques are for force control-based robots and position/velocity control-based robots, respectively. Strong impact forces have a negative influence on multiple robotic tasks. Recently, preemptive impact reduction techniques that expand conventional contact impedance control by using proximity sensors have been examined. However, a seamless transition from impact reduction to contact impedance control has not yet been accomplished. The proposed methods utilize a serial combined impedance control framework to solve this problem. The preemptive impact reduction feature can be added to the already implemented impedance controller because the parameter design is divided into impact reduction and contact impedance control. There is no undesirable contact force during the transition. Furthermore, even though the preemptive impact reduction employs a crude optical proximity sensor, the influence of reflectance is minimized using a virtual viscous force. Analyses and real-world experiments confirm these benefits.
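Applying differentially private stochastic gradient descent (DPSGD) to the training of modern large-scale neural networks, such as transformer-based models, is a challenging task, because the magnitude of the noise added to the gradients at each iteration scales with the model dimension, which significantly hinders learning. We propose a unified framework, $\textsf{LSG}$, that fully exploits the low-rank and sparse structure of neural networks to reduce the dimension of the gradient updates and thereby alleviate the negative impact of DPSGD. The gradient updates are first approximated with a pair of low-rank matrices. A novel strategy is then used to sparsify the gradients, resulting in low-dimensional, less noisy updates that still preserve the performance of the neural network. Empirical evaluation on natural language processing and computer vision tasks shows that our method outperforms other state-of-the-art baselines.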
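Nowadays, multiple smart-city initiatives are under way around the world to improve services and the livability of urban areas. SmartSantander is a smart-city project in Santander, Spain, that relies on wireless sensor network technology to deploy heterogeneous sensors within the city and measure multiple parameters, including outdoor parking information. In this paper, we study the prediction of parking-lot availability using historical data from more than 300 outdoor parking sensors of SmartSantander. We design a graph-based model to capture the periodic fluctuation and geographical location of parking lots. To develop and evaluate our model, we use a 3-year dataset of parking-lot availability in the city of Santander. Our model achieves high accuracy compared with existing sequence-to-sequence models, and is accurate enough to provide a parking information service in the city. We apply our model to a smartphone application so that it can be widely used by citizens and tourists.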